Prior works on Information Extraction (IE) typically predict different tasks and instances (e.g., event triggers, entities, roles, relations) independently, while neglecting their interactions and leading to model inefficiency. In this work, we introduce a joint IE framework, HighIE, that learns and predicts multiple IE tasks by integrating high-order cross-task and cross-instance dependencies. Specifically, we design two categories of high-order factors: homogeneous factors and heterogeneous factors. Then, these factors are utilized to jointly predict labels of all instances. To address the intractability problem of exact high-order inference, we incorporate a high-order neural decoder that is unfolded from a mean-field variational inference method. The experimental results show that our approach achieves consistent improvements on three IE tasks compared with our baseline and prior work.
translated by Google Translate
Adversarial attacks on thermal infrared imaging expose the risks of related applications. Estimating the security of these systems is essential for safely deploying them in the real world. In many cases, realizing the attacks in the physical space requires elaborate special perturbations. These solutions are often \emph{impractical} and \emph{attention-grabbing}. To address the need for a physically practical and stealthy adversarial attack, we introduce \textsc{HotCold} Block, a novel physical attack on infrared detectors that hides persons using wearable Warming Paste and Cooling Paste. By attaching these readily available temperature-controlled materials to the body, \textsc{HotCold} Block efficiently evades human eyes. Moreover, unlike existing methods that build adversarial patches with complex texture and structure features, \textsc{HotCold} Block utilizes an SSP-oriented adversarial optimization algorithm that enables attacks with pure color blocks and explores the influence of size, shape, and position on attack performance. Extensive experimental results in both digital and physical environments demonstrate the performance of our proposed \textsc{HotCold} Block. \emph{Code is available at \textcolor{magenta}{https://github.com/weihui1308/HOTCOLDBlock}}.
Few-shot learning (FSL), which aims to classify unseen classes with few samples, is challenging due to data scarcity. Although various generative methods have been explored for FSL, the entangled generation process of these methods exacerbates the distribution shift in FSL, thus greatly limiting the quality of generated samples. To address these challenges, we propose a novel Information Bottleneck (IB) based Disentangled Generation Framework for FSL, termed DisGenIB, that can simultaneously guarantee the discrimination and diversity of generated samples. Specifically, we formulate a novel information bottleneck framework that applies to both disentangled representation learning and sample generation. Unlike existing IB-based methods, which can hardly exploit priors, DisGenIB can effectively utilize priors to further facilitate disentanglement. We further prove in theory that some previous generative and disentanglement methods are special cases of DisGenIB, which demonstrates its generality. Extensive experiments on challenging FSL benchmarks confirm the effectiveness and superiority of DisGenIB, together with the validity of our theoretical analyses. Our code will be open-sourced upon acceptance.
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAVs) and Unmanned Surface Vehicles (USVs), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
Video super-resolution is one of the most popular tasks on mobile devices, widely used to automatically improve low-bitrate and low-resolution video streams. While numerous solutions have been proposed for this problem, they are usually quite computationally demanding, exhibiting low FPS rates and poor power efficiency on mobile devices. In this Mobile AI challenge, we address this problem and ask the participants to design an end-to-end real-time video super-resolution solution for mobile NPUs optimized for low energy consumption. The participants were provided with the REDS training dataset containing video sequences for a 4X video upscaling task. The runtime and power efficiency of all models were evaluated on the powerful MediaTek Dimensity 9000 platform with a dedicated AI processing unit capable of accelerating floating-point and quantized neural networks. All proposed solutions are fully compatible with the above NPU, reaching up to 500 FPS and a power consumption of 0.2 [Watt / 30 FPS]. A detailed description of all models developed in the challenge is provided in this paper.
Although Deep Neural Networks (DNNs) have achieved impressive results in computer vision, their exposed vulnerability to adversarial attacks remains a serious concern. A series of works has shown that adding elaborate perturbations to images can cause catastrophic degradation in the performance metrics of DNNs. This phenomenon exists not only in the digital space but also in the physical space. Therefore, estimating the security of these DNN-based systems is critical for safely deploying them in the real world, especially for security-critical applications, e.g., autonomous cars, video surveillance, and medical diagnosis. In this paper, we focus on physical adversarial attacks and provide a comprehensive survey of over 150 existing papers. We first clarify the concept of the physical adversarial attack and analyze its characteristics. Then, we define the adversarial medium, which is essential for performing attacks in the physical world. Next, we present the physical adversarial attack methods in task order: classification, detection, and re-identification, and introduce their performance in solving the trilemma: effectiveness, stealthiness, and robustness. In the end, we discuss the current challenges and potential future directions.
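In contrast to multi-label learning, label distribution learning characterizes the polysemy of examples by a label distribution in order to represent richer semantics. In the learning process of label distribution, the training data is obtained mainly by manual annotation or by label enhancement algorithms that generate the label distribution. Unfortunately, the complexity of the manual annotation task or the inaccuracy of the label enhancement algorithm leads to noise and uncertainty in the label distribution training set. To alleviate this problem, we introduce an implicit distribution into the label distribution learning framework to characterize the uncertainty of each label value. Specifically, we use deep implicit representation learning to construct a label distribution matrix with Gaussian prior constraints, where each row component corresponds to the distribution estimate of one label value, and this row component is constrained by a prior Gaussian distribution to moderate the interference of noise and uncertainty in the label distribution dataset. Finally, each row component of the label distribution matrix is transformed into a standard label distribution form by using a self-attention algorithm. In addition, several approaches with regularization characteristics are applied in the training phase to improve the performance of the model.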
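The Point Pair Feature (PPF) is widely used for 6D pose estimation. In this paper, we propose an efficient 6D pose estimation method based on the PPF framework. We introduce a well-targeted down-sampling strategy that focuses more on edge regions in order to extract complex geometry efficiently. A pose hypothesis verification method is proposed to resolve symmetric ambiguity by computing the edge matching degree. We perform evaluations on two challenging datasets and one dataset collected in the real world, which demonstrate the superiority of our method for pose estimation of geometrically complex, occluded, and symmetric objects. We further validate our method by applying it to simulated punctures.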
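As an illustration of the feature that the abstract above builds on, the following is a minimal sketch of the commonly used four-dimensional point pair feature (distance between two oriented points plus three angles). It is not taken from the paper: the function name `point_pair_feature`, the helper names, and the tuple-based point representation are assumptions made for this example, and the paper's down-sampling and verification steps are not shown.

```python
import math

def _sub(a, b):
    # Component-wise difference a - b of two 3D vectors.
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _norm(a):
    return math.sqrt(_dot(a, a))

def _angle(a, b):
    # Angle in [0, pi] between two vectors; 0.0 if either is zero-length
    # (a convention assumed here for degenerate input).
    na, nb = _norm(a), _norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    cos = max(-1.0, min(1.0, _dot(a, b) / (na * nb)))
    return math.acos(cos)

def point_pair_feature(p1, n1, p2, n2):
    """Hypothetical helper: PPF of two oriented points (position, normal).

    Returns (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1.
    """
    d = _sub(p2, p1)
    return (_norm(d), _angle(n1, d), _angle(n2, d), _angle(n1, n2))

if __name__ == "__main__":
    # Two points one unit apart on a plane with identical upward normals.
    print(point_pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1)))
```

In PPF-style pipelines such features are typically quantized and used as hash keys for matching model and scene point pairs; that stage is omitted here.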
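There is growing research interest in using deep learning based methods to achieve fully automated segmentation of lesions in positron emission tomography/computed tomography (PET/CT) scans for the prognosis of various cancers. Recent advances in medical image segmentation show that nnUNet is feasible for a variety of tasks. However, lesion segmentation in PET images is not straightforward, because lesions and physiological uptake have similar distribution patterns. Distinguishing them requires additional structural information from the CT images. This paper introduces an nnUNet-based method for the lesion segmentation task. The proposed model is designed on the basis of a joint 2D and 3D nnUNet architecture to predict lesions across the whole body. It allows for automated segmentation of potential lesions. We evaluate the proposed method in the context of the AutoPET challenge, which measures lesion segmentation performance in terms of Dice score, false-positive volume, and false-negative volume.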
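For readers unfamiliar with the Dice score named in the abstract above, a minimal sketch of the standard definition, 2|P ∩ T| / (|P| + |T|), on binary masks is given below. The function name `dice_score`, the flat-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions for this example; the challenge's own evaluation code and its false-positive and false-negative volume metrics are not reproduced here.

```python
def dice_score(pred, target):
    """Hypothetical helper: Dice overlap of two binary masks.

    pred, target: flat sequences of 0/1 voxel labels of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as lesion in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total number of lesion voxels in the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total

if __name__ == "__main__":
    # One overlapping voxel, two lesion voxels in each mask.
    print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))
```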
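In this paper, we revisit the long-standing problem of automatically reconstructing 3D objects from single line drawings. Previous optimization-based methods can generate compact and accurate 3D models, but their success rates depend heavily on the ability to (i) identify a set of true geometric constraints, and (ii) choose a good initial value for the numerical optimization. In view of these challenges, we propose to train deep neural networks to detect pairwise relationships between geometric entities (i.e., edges) in the 3D object and to predict initial depth values of the vertices. Our experiments on a large dataset of CAD models show that, by leveraging deep learning in a geometric constraint solving pipeline, the success rate of optimization-based 3D reconstruction can be significantly improved.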